Prediction of aqueous solubility of druglike organic compounds using partial least squares, backpropagation network and support vector machine

Authors

  • Dong-Sheng Cao
  • Qing-Song Xu
  • Yi-Zeng Liang
  • Xian Chen
  • Hong-Dong Li
Abstract

Aqueous solubility of drug compounds plays a very important role in drug research and development. In this study, we collected 225 diverse druglike molecules with accurate aqueous solubility data. Three commonly used methods, namely partial least squares (PLS), back-propagation network (BPN) and support vector regression (SVR), were employed to model the quantitative structure–property relationship (QSPR) for the aqueous solubility of 180 druglike compounds. Twenty-eight molecular descriptors were related to the aqueous solubility of the drugs. To obtain reliable and robust aqueous solubility predictions, a novel outlier detection method was employed to detect all outliers in the established models simultaneously. Following the Organization for Economic Co-operation and Development (OECD) principles, the QSPR models were checked by both internal and external statistical validation to ensure both reliability and predictive ability. The results indicate that all three models provide good predictive ability for drug aqueous solubility. Furthermore, the predictive ability of SVR was found to be superior to that of PLS and BPN, and the 28 selected molecular descriptors give a reliable and direct interpretation of aqueous solubility. Copyright 2010 John Wiley & Sons, Ltd.
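The internal and external validation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the data are synthetic, there is a single hypothetical descriptor, and an ordinary least-squares line stands in for the PLS, BPN and SVR models. Internal validation is shown as a leave-one-out Q^2, and external validation as R^2 on held-out compounds, computed here against the test-set mean (other conventions exist).

```python
# Illustrative only: synthetic data, one hypothetical descriptor, and an
# ordinary least-squares line standing in for the PLS/BPN/SVR models.

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def q2_leave_one_out(xs, ys):
    """Internal validation: Q^2 = 1 - PRESS / SS_total under leave-one-out."""
    mean_y = sum(ys) / len(ys)
    press = 0.0
    for i in range(len(xs)):
        # Refit without compound i, then predict it.
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_total


def r2_external(train_x, train_y, test_x, test_y):
    """External validation: fit on the training set, score held-out compounds."""
    slope, intercept = fit_line(train_x, train_y)
    mean_test = sum(test_y) / len(test_y)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(test_x, test_y))
    ss_tot = sum((y - mean_test) ** 2 for y in test_y)
    return 1.0 - ss_res / ss_tot


# Synthetic descriptor values and log-solubility values (made up for this sketch).
train_x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
train_y = [-0.1, -1.2, -1.9, -3.1, -3.9, -5.1]
test_x = [0.5, 2.5, 4.5]
test_y = [-0.6, -2.4, -4.6]

q2 = q2_leave_one_out(train_x, train_y)
r2_ext = r2_external(train_x, train_y, test_x, test_y)
print(f"leave-one-out Q^2 = {q2:.3f}, external R^2 = {r2_ext:.3f}")
```

A model that scores well on the training compounds but poorly on either statistic would be considered unreliable under this kind of check, which is why both are reported.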


Similar resources

Prediction of the pharmaceutical solubility in water and organic solvents via different soft computing models

Solubility data of solids in aqueous and organic solvents are very important physicochemical properties considered in the design of industrial processes and in theoretical studies. In this study, experimental solubility data of 666 pharmaceutical compounds in water and 712 pharmaceutical compounds in organic solvents were collected from different sources. Three different artificia...

QSAR Prediction of Half-Life, Nondimensional Effective Degradation Rate Constant and Effective Péclet Number of Volatile Organic Compounds

In this work, quantitative structure-activity relationship models were developed to predict three bioenvironmental parameters of 28 volatile organic compounds, which are used in assessing the behavior of pollutants in soil. These parameters are the half-life, the nondimensional effective degradation rate constant and the effective Péclet number in two types of soil. The most effective descripto...

Least Squares Support Vector Machine for Constitutive Modeling of Clay

Constitutive modeling of clay is an important research topic in geotechnical engineering. It is difficult to approximate the stress-strain relationship of clay with precise mathematical expressions. Artificial neural networks (ANN) and support vector machines (SVM) have been used successfully in constitutive modeling of clay. However, generalization ability of ANN has some limitations, and application of...

Random Forest Models To Predict Aqueous Solubility

Random Forest regression (RF), Partial-Least-Squares (PLS) regression, Support Vector Machines (SVM), and Artificial Neural Networks (ANN) were used to develop QSPR models for the prediction of aqueous solubility, based on experimental data for 988 organic molecules. The Random Forest regression model predicted aqueous solubility more accurately than those created by PLS, SVM, and ANN and offer...

Least-squares support vector machine and its application in the simultaneous quantitative spectrophotometric determination of pharmaceutical ternary mixture

This paper proposes the least-squares support vector machine (LS-SVM) as an intelligent method applied to absorption spectra for the simultaneous determination of paracetamol (PCT), caffeine (CAF) and ibuprofen (IB) in Novafen. The signal-to-noise ratio (S/N) increased. In the LS-SVM model, the kernel parameter (σ²) and capacity factor (C) were also optimized. Excellent prediction was shown usin...

Publication date: 2010